Skip to content

Hold this via WeakRef in IntersectionObserver callback#5821

Open
srid wants to merge 1 commit intoxtermjs:masterfrom
juspay:fix/intersection-observer-weakref
Open

Hold this via WeakRef in IntersectionObserver callback#5821
srid wants to merge 1 commit intoxtermjs:masterfrom
juspay:fix/intersection-observer-weakref

Conversation

@srid
Copy link
Copy Markdown
Contributor

@srid srid commented Apr 17, 2026

Summary

Fix for #5820. RenderService._registerIntersectionObserver creates an IntersectionObserver whose callback closes over this directly. Although _observerDisposable calls observer.disconnect() on dispose, the retained callback chain keeps this (RenderService) alive in practice — along with _coreService → _bufferService → buffers → BufferLine → Uint32Array cell data.

Wrap this in a WeakRef. While RenderService is alive, deref() returns it and the handler runs unchanged. Once no strong refs remain, the callback becomes a no-op and the BufferService graph can GC.

Evidence (from #5820)

Heap-snapshot diff across 30 mount/unmount cycles of 7 Terminal instances in a multi-tile app:

Class Δ count Δ bytes
native:system / JSArrayBufferData +175,594 +220 MB
object:Uint32Array +175,594 +10 MB
object:ArrayBuffer +175,594 +9 MB
object:BufferLine +175,594 +5 MB

Every retained Uint32Array traced to the same retainer chain — global IntersectionObserver registry → callback closure → RenderService → service graph → BufferLines. After the WeakRef fix, production-equivalent local testing shows Memory Footprint flat across 30 mode-toggles (was +367 MB/30-toggles before).

Why not just fix disconnect()?

Unclear why observer.disconnect() isn't releasing the callback in the observed environment — could be DevTools instrumentation, a Chrome extension patching window.IntersectionObserver, a native registry quirk, or something else. The WeakRef wrap is a defensive fix that breaks the retention chain regardless of the underlying cause and preserves functional semantics exactly.

Test

No behavioral change to validate beyond existing renderer-pause-on-visibility tests. A regression would manifest as the pause handler not firing when visibility changes; while RenderService is strongly rooted (which it is for any live terminal), deref() always returns the instance.

Closes #5820

srid added a commit to juspay/xterm.js that referenced this pull request Apr 17, 2026
Downstream kolu (juspay/kolu) consumes this fork via pnpm.overrides
github URL. The main fix/kolu-xterm-fixes branch is source-only;
this branch adds the pre-built .mjs bundles so no build step runs
at pnpm install time.

Built output contains both fixes:
  - xtermjs#5817: register CursorBlinkStateManager + _pausedResizeTask
  - xtermjs#5821: WeakRef this in IntersectionObserver callback (xtermjs#5820)
srid added a commit to juspay/kolu that referenced this pull request Apr 17, 2026
…fork

Bumps the pnpm override pointer from `fix/dispose-leaks-built` to
`fix/kolu-xterm-fixes-built`, which stacks a second fix on top of
the existing dispose-leaks patches:

- xtermjs/xterm.js#5817 (`fix/dispose-leaks`): register
  CursorBlinkStateManager + _pausedResizeTask disposables. Already
  in production.
- xtermjs/xterm.js#5821 (new, `fix/intersection-observer-weakref`):
  wrap `this` in a `WeakRef` inside `RenderService`'s IntersectionObserver
  callback. `observer.disconnect()` on our side wasn't releasing the
  callback in practice — heap snapshot showed 175,594 retained
  Uint32Arrays (~220 MB / 30 mode-toggles with 7 terminals) traced
  through the global IntersectionObserver registry → callback closure
  → RenderService → BufferService → BufferLines. WeakRef breaks that
  chain regardless of why the native registry held the callback.

Local measurement on kolu@zest (fresh tab, 30 canvas↔focus toggles,
7 terminals restored):

                                 Before fix  →  After fix
  Task Manager Memory Footprint    +367 MB   →   -3 MB
  BufferLine (Z1) instances Δ      +175,594  →   ~0

Upstream tracking: xtermjs/xterm.js#5820 (issue),
xtermjs/xterm.js#5821 (PR). Fork consumption will collapse to a plain
version bump once upstream merges + releases.
srid added a commit to juspay/kolu that referenced this pull request Apr 17, 2026
…fork

Bumps the pnpm override pointer from `fix/dispose-leaks-built` to
`fix/kolu-xterm-fixes-built`, which stacks a second fix on top of
the existing dispose-leaks patches:

- xtermjs/xterm.js#5817 (`fix/dispose-leaks`): register
  CursorBlinkStateManager + _pausedResizeTask disposables. Already
  in production.
- xtermjs/xterm.js#5821 (new, `fix/intersection-observer-weakref`):
  wrap `this` in a `WeakRef` inside `RenderService`'s IntersectionObserver
  callback. `observer.disconnect()` on our side wasn't releasing the
  callback in practice — heap snapshot showed 175,594 retained
  Uint32Arrays (~220 MB / 30 mode-toggles with 7 terminals) traced
  through the global IntersectionObserver registry → callback closure
  → RenderService → BufferService → BufferLines. WeakRef breaks that
  chain regardless of why the native registry held the callback.

Local measurement on kolu@zest (fresh tab, 30 canvas↔focus toggles,
7 terminals restored):

                                 Before fix  →  After fix
  Task Manager Memory Footprint    +367 MB   →   -3 MB
  BufferLine (Z1) instances Δ      +175,594  →   ~0

Upstream tracking: xtermjs/xterm.js#5820 (issue),
xtermjs/xterm.js#5821 (PR). Fork consumption will collapse to a plain
version bump once upstream merges + releases.
srid added a commit to juspay/kolu that referenced this pull request Apr 17, 2026
…gles) (#617)

## Summary

Fixes the canvas/focus-toggle memory leak that was pushing production
pureintent to 1.2 GB Memory Footprint. The leak is upstream in
`xterm.js`; this PR bumps the `pnpm.overrides` pointer to a fork branch
that stacks one additional fix on top of `fix/dispose-leaks-built`.

**Upstream tracking:** xtermjs/xterm.js#5820 (issue),
xtermjs/xterm.js#5821 (PR).

## The leak

`RenderService._registerIntersectionObserver` creates an
`IntersectionObserver` whose callback closes over `this` directly.
Although `_observerDisposable` calls `observer.disconnect()` on dispose,
the retained callback chain keeps `this` (RenderService) alive in
practice — along with `_coreService → _bufferService → buffers →
BufferLine → Uint32Array` cell data.

Heap-snapshot diff across 30 mount/unmount cycles of 7 `Terminal`
instances:

| Class | Δ count | Δ bytes |
|---|---|---|
| `native:system / JSArrayBufferData` | +175,594 | **+220 MB** |
| `object:Uint32Array` | +175,594 | +10 MB |
| `object:ArrayBuffer` | +175,594 | +9 MB |
| `object:BufferLine` | +175,594 | +5 MB |

Every retained `Uint32Array` traced through the same retainer signature:
global `IntersectionObserver` registry → callback closure →
`RenderService` → service graph → BufferLines. 175,594 = 30 toggles × 7
terminals × ~830 scrollback lines — basically the full buffer of every
disposed `Terminal` pinned past `terminal.dispose()`.

The fix wraps `this` in a `WeakRef`. Functional semantics preserved:
while RenderService is alive, `deref()` returns it and the handler runs
unchanged. Once no strong refs remain, the callback is a no-op and the
BufferService graph can GC.

## Ground-truth measurement

Local `kolu@zest` (fresh tab, 30 canvas↔focus toggles, 7 terminals
restored from session):

| Metric | Before | After |
|---|---|---|
| Memory Footprint Δ / 30 toggles | **+367 MB** | **−3 MB** |
| JS live Δ | +66 MB | ~0 |
| BufferLine instances retained | +175,594 | ~0 |

Also verified the `fix/dispose-leaks` fix (upstream #5817) still ships —
the stacked fork branch contains both.

## Alternatives considered

- **Chase why `disconnect()` isn't releasing the callback.** Could be
DevTools instrumentation, a Chrome extension patching
`window.IntersectionObserver`, a native registry quirk — unclear, and
likely environment-dependent. The WeakRef wrap is defensive and
preserves semantics regardless of root cause.
- **Null out `_coreService` / `_bufferService` in
`RenderService.dispose()`.** Narrower scope but requires a dispose
override. WeakRef is cleaner — no new dispose path, just a smaller
capture surface.

## Test plan

- [x] `just check` passes
- [x] Local fresh tab + 30 mode-toggles: Memory Footprint flat (verified
via Chrome Task Manager)
- [x] Deploy to `pureintent`, retest footprint delta on production
prod-like build
- [ ] Upstream xtermjs/xterm.js#5821 lands → drop fork override for this
specific fix in a follow-up

🤖 Diagnosis assisted by [Claude Code](https://claude.com/claude-code)
srid added a commit to juspay/kolu that referenced this pull request Apr 17, 2026
Blog post tracing the IntersectionObserver / BufferLine retention
story — wrong turn in #614, byte-delta heap diff as the pivot, the
one-line WeakRef patch in xtermjs/xterm.js#5821.

Written in the voice of the debugging-story canon: cold open on the
symptom, the wrong turns compressed rather than storied, anticlimactic
fix. Targets readers from backend / systems backgrounds who may not
be fluent in web-frontend memory tooling — links to MDN and Chrome
DevTools docs for the web-specific concepts.

Credit: I drove; Claude Code (https://claude.com/claude-code) did the
agent-side work.

Under docs/perf-investigations/ alongside the technical chapters in
memory-learnings.md.
srid added a commit to juspay/kolu that referenced this pull request Apr 18, 2026
Draft blog post for review. Push revisions as additional commits for
iteration.

Preview on GitHub by clicking the file in the diff, or view raw for a
cleaner markdown render:

-
[docs/perf-investigations/the-leak-that-wasnt-in-any-context.md](https://github.com/juspay/kolu/blob/docs/leak-blog-post/docs/perf-investigations/the-leak-that-wasnt-in-any-context.md)

Shape of the post:

- Intro — what Kolu is (cockpit for coding agents), canvas mode (linked
to [@sridca tweet](https://x.com/sridca/status/2044953014100726221)),
symptom
- Wrong turn — compressed, not storied: six commits chasing the
`Context` count proxy on #614, 89% reduction, zero Task Manager movement
- What I was actually measuring — brief explainer on Task Manager vs
`performance.memory` for readers who aren't web-perf fluent
- The one-line fix — byte-delta diff, retainer chain, WeakRef diff
- The xterm.js side — highlights both contributions
(xtermjs/xterm.js#5817 + xtermjs/xterm.js#5821) as "laughably small,
hours to find"
- What I'd tell past-me — three takeaways aimed at backend/systems
readers
- Links to master-branch docs/skill/scripts for anyone who wants the
full investigation record

Attribution: I drove; [Claude Code](https://claude.com/claude-code) did
the agent-side work.

Not for merge as-is — draft is to iterate on wording before deciding
where it's actually published.
Comment on lines +141 to 147
const weakSelf = new WeakRef(this);
const observer = new w.IntersectionObserver(
e => weakSelf.deref()?._handleIntersectionChange(e[e.length - 1]),
{ threshold: 0 }
);
observer.observe(screenElement);
this._observerDisposable.value = toDisposable(() => observer.disconnect());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this isn't handled by observer.disconnect() dropping the refs to the callback. Is there a way to solve this without the introduction of WeakRef?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me this smells like a browser bug and might be better addressed in chrome directly. Should also be tested, whether it is reproducible on other browser engines.

Copy link
Copy Markdown
Member

@jerch jerch Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked the w3c spec - disconnect does indeed not nullify the callback, in fact the observer object holds onto the callback to be reusable with new .observe(some_element) calls. So this is indeed specced like this.

Would it help to just throw away the observer object itself for the GC to clean the callback up?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in observer = undefined?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yepp (and to make sure, the ref gets not bound elsewhere).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be preferable to a WeakRef, every time I've thought I needed it there was something simpler like this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed this to help too.

So, we can close this PR and choose #5831 instead.

The IntersectionObserver callback in RenderService closes over `this` directly.
On dispose, `_observerDisposable` calls `observer.disconnect()` — which in
theory releases the callback. In practice, across 30 mount/unmount cycles of
7 terminals in a multi-tile app, the retainer chain

    Window.IntersectionObserver (registry)
      -> callback closure
      -> RenderService (this)
      -> _coreService -> _bufferService -> buffers -> lines -> BufferLine -> Uint32Array

kept 175,594 BufferLine Uint32Arrays alive (+220 MB of native ArrayBuffer
bytes) past dispose. Whether this is a Chrome DevTools wrapper, an extension
patching window.IntersectionObserver, or something subtler in the native
registry — either way, `disconnect()` on our side wasn't enough to free the
callback in that environment.

Wrap `this` in a WeakRef so the callback holds only a weak reference.
Functional behavior is preserved: while RenderService is alive, `deref()`
returns it and the handler runs. Once no strong refs remain, the callback
becomes a no-op and the BufferService graph is free to GC — which is what
the outer `_observerDisposable` was supposed to guarantee but doesn't in
practice.

Refs xtermjs#5820
@jerch
Copy link
Copy Markdown
Member

jerch commented Apr 22, 2026

@srid In juspay/kolu#652 you write, that the callback retention is not reproducible locally with chrome-devtools MCP (shows always benign behavior), but with normal chrome + devtools it is and that only the weakref counters it effectively.

Can you please repeat the tests with devtools not opened, e.g. by adding the async snippet on toplevel and let it trigger the toggles by a timeout from normal load? then inspect the memory footprint again. Just want to make sure, that this is not a devtools instrumentation bug (devtools exhibits several of those, esp. during perf testing with screwed up timer events and overly big websocket load).

@srid
Copy link
Copy Markdown
Contributor Author

srid commented Apr 23, 2026

In juspay/kolu#652 you write, that the callback retention is not reproducible locally with chrome-devtools MCP (shows always benign behavior)

@jerch This was only because Kolu had already moved to an unified infinite canvas model (no more two-mode switching). And I had to go back in history to old model, and reproduce it there. Sharing results shortly, in the thread above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

IntersectionObserver callback retains RenderService → BufferLines after dispose

3 participants